Goto

Collaborating Authors

 stochastic method



Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems

Neural Information Processing Systems

Stochastic Proximal Gradient (SPG) methods have been widely used for solving optimization problems with a simple (possibly non-smooth) regularizer in machine learning and statistics. However, to the best of our knowledge no non-asymptotic convergence analysis of SPG exists for non-convex optimization with a non-smooth and non-convex regularizer. All existing non-asymptotic analysis of SPG for solving non-smooth non-convex problems require the non-smooth regularizer to be a convex function, and hence are not applicable to a non-smooth non-convex regularized problem. This work initiates the analysis to bridge this gap and opens the door to non-asymptotic convergence analysis of non-smooth non-convex regularized problems. We analyze several variants of mini-batch SPG methods for minimizing a non-convex objective that consists of a smooth non-convex loss and a non-smooth non-convex regularizer. Our contributions are two-fold: (i) we show that they enjoy the same complexities as their counterparts for solving convex regularized non-convex problems in terms of finding an approximate stationary point; (ii) we develop more practical variants using dynamic mini-batch size instead of a fixed mini-batch size without requiring the target accuracy level of solution. The significance of our results is that they improve upon the-state-of-art results for solving non-smooth non-convex regularized problems. We also empirically demonstrate the effectiveness of the considered SPG methods in comparison with other peer stochastic methods.





Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors present a framework for online optimization of non-decomposable loss functions (those that do not decompose as a sum over data points). Their idea is to write L_t(w) = l(w, x_{1:t}) - l(w, x_{1:t-1}). While L_t(w) in general will not be convex, its cumulative sum will be, which turns out to be enough for the FTRL analysis to hold. However, we still need to establish stability properties for L_t(w), which is harder than for decomposable loss functions.




A single-loop SPIDER-type stochastic subgradient method for expectation-constrained nonconvex nonsmooth optimization

Liu, Wei, Xu, Yangyang

arXiv.org Artificial Intelligence

Many real-world problems, such as those with fairness constraints, involve complex expectation constraints and large datasets, necessitating the design of efficient stochastic methods to solve them. Most existing research focuses on cases with no {constraint} or easy-to-project constraints or deterministic constraints. In this paper, we consider nonconvex nonsmooth stochastic optimization problems with expectation constraints, for which we build a novel exact penalty model. We first show the relationship between the penalty model and the original problem. Then on solving the penalty problem, we present a single-loop SPIDER-type stochastic subgradient method, which utilizes the subgradients of both the objective and constraint functions, as well as the constraint function value at each iteration. Under certain regularity conditions (weaker than Slater-type constraint qualification or strong feasibility assumed in existing works), we establish an iteration complexity result of $O(\epsilon^{-4})$ to reach a near-$\epsilon$ stationary point of the penalized problem in expectation, matching the lower bound for such tasks. Building on the exact penalization, an $(\epsilon,\epsilon)$-KKT point of the original problem is obtained. For a few scenarios, our complexity of either the {objective} sample subgradient or the constraint sample function values can be lower than the state-of-the-art results by a factor of $\epsilon^{-2}$. Moreover, on solving two fairness-constrained problems, our method is significantly (up to 466 times) faster than the state-of-the-art algorithms, including switching subgradient method and inexact proximal point methods.


Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems

Neural Information Processing Systems

Stochastic Proximal Gradient (SPG) methods have been widely used for solving optimization problems with a simple (possibly non-smooth) regularizer in machine learning and statistics. However, to the best of our knowledge no non-asymptotic convergence analysis of SPG exists for non-convex optimization with a non-smooth and non-convex regularizer. All existing non-asymptotic analysis of SPG for solving non-smooth non-convex problems require the non-smooth regularizer to be a convex function, and hence are not applicable to a non-smooth non-convex regularized problem. This work initiates the analysis to bridge this gap and opens the door to non-asymptotic convergence analysis of non-smooth non-convex regularized problems. We analyze several variants of mini-batch SPG methods for minimizing a non-convex objective that consists of a smooth non-convex loss and a non-smooth non-convex regularizer.